351: fix: Don't depend on Serde to provide fields in the right order#9
351: fix: Don't depend on Serde to provide fields in the right order#9martin-augment wants to merge 1 commit intomainfrom
Conversation
WalkthroughThis pull request modifies Avro serialization to handle out-of-order struct fields and enforce default values for skipped fields. Changes include updating the ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
avro/src/error.rs(1 hunks)avro/src/ser_schema.rs(8 hunks)avro/tests/avro-rs-226.rs(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
avro/src/ser_schema.rs (4)
avro/src/error.rs (1)
new(38-42)avro/src/writer.rs (2)
new(95-97)schema(171-173)avro/src/ser.rs (13)
new(56-63)new(67-77)new(81-88)new(92-96)new(100-106)end(312-314)end(328-330)end(344-346)end(364-372)end(386-388)end(418-427)end(445-447)end(465-476)avro/src/schema.rs (5)
new(253-259)new(375-377)name(379-381)name(1208-1216)schema(7114-7114)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Cursor Bugbot
- GitHub Check: clippy_check (stable)
- GitHub Check: ci (1.86.0, ubuntu-24.04, x86_64-unknown-linux-gnu)
- GitHub Check: ci (stable, ubuntu-24.04-arm, aarch64-unknown-linux-gnu)
- GitHub Check: ci (beta, ubuntu-24.04-arm, wasm32-unknown-unknown)
- GitHub Check: claude-review
- GitHub Check: ci (stable, ubuntu-24.04, wasm32-unknown-unknown)
- GitHub Check: interop
- GitHub Check: web-assembly
🔇 Additional comments (8)
avro/src/error.rs (1)
517-525: LGTM! Error variant changes are well-structured.The change from
&'static strtoStringforfield_nameproperly supports dynamic field names from schema lookups. The newMissingDefaultForSkippedFieldvariant provides clear error messaging for skipped fields without defaults.avro/src/ser_schema.rs (4)
252-256: LGTM! Out-of-order field buffering design.The
field_cacheandnext_fieldtracking enables proper handling of fields that arrive out of schema order. The approach is straightforward and maintains correctness.
272-313: Well-implemented in-order vs out-of-order field handling.The logic correctly:
- Writes fields directly when in-order (fast path)
- Buffers out-of-order fields for later emission
- Flushes buffered fields as soon as their position is reached
The linear search in
field_cacheis acceptable for typical record sizes.
390-417: LGTM! skip_field correctly delegates to serialize_next_field.The implementation properly:
- Looks up the skipped field in the schema
- Serializes the default value through
serialize_next_field(maintaining field order)- Returns
MissingDefaultForSkippedFieldwhen no default is available
2982-3026: Good test coverage for out-of-order field serialization.The test correctly validates the core feature: a Rust struct with fields
(a, b)serialized against a schema with fields(b, a). The roundtrip through Writer/Reader confirms correct behavior.avro/tests/avro-rs-226.rs (3)
47-49: LGTM! Added required default for skip_serializing_if field.The
#[avro(default = "null")]annotation is necessary because the new implementation requires defaults for fields that may be skipped during serialization.
108-122: LGTM! Selective default annotations are correctly applied.The defaults are appropriately added only to fields that may be skipped during serialization:
skip_serializingandskip_serializing_if: need#[avro(default = "null")]skip_deserializing: correctly has no default (it's still serialized)skip: uses#[avro(skip)]to exclude from schema entirely
137-201: LGTM! Comprehensive panic tests for missing defaults.The three
#[should_panic]tests properly validate error handling for:
skip_serializing_ifwithout a defaultskipwithout a defaultskip_serializingwithout a defaultThe expected messages correctly match the
MissingDefaultForSkippedFielderror format.
| // Check if all fields were written | ||
| if self.next_field != self.record_schema.fields.len() { | ||
| let name = self.record_schema.fields[self.next_field].name.clone(); | ||
| return Err(Details::GetField(name).into()); | ||
| } |
There was a problem hiding this comment.
Dead code: This check can never trigger.
The while loop at line 317 only exits when self.next_field == self.record_schema.fields.len(). Therefore, the condition at line 348 is always false and this block is unreachable.
Remove the redundant check:
}
- // Check if all fields were written
- if self.next_field != self.record_schema.fields.len() {
- let name = self.record_schema.fields[self.next_field].name.clone();
- return Err(Details::GetField(name).into());
- }
assert!(
self.field_cache.is_empty(),
"There should be no more unwritten fields at this point"
);
Ok(self.bytes_written)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Check if all fields were written | |
| if self.next_field != self.record_schema.fields.len() { | |
| let name = self.record_schema.fields[self.next_field].name.clone(); | |
| return Err(Details::GetField(name).into()); | |
| } | |
| } | |
| assert!( | |
| self.field_cache.is_empty(), | |
| "There should be no more unwritten fields at this point" | |
| ); | |
| Ok(self.bytes_written) |
🤖 Prompt for AI Agents
In avro/src/ser_schema.rs around lines 347 to 351, the final check that returns
Err(Details::GetField(...)) is dead code because the preceding while loop always
exits only when self.next_field == self.record_schema.fields.len(); remove this
redundant if-block entirely (delete the lines that fetch the field name and
return the error) and ensure any trailing code and formatting remain consistent
after its removal.
There was a problem hiding this comment.
value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct! There is no need to check that the tracked position and the number of fields are the same because this is already checked in the earlier whilez condition. Prevents adding unused code.
|
Summary: This PR fixes issue apache#351 by handling cases where Serde serializes struct fields in a different order than the Avro schema expects. Positive Aspects:
Issues Found:
Performance Analysis:
Recommendations: Overall: Solid implementation with sound algorithm. Ready to merge after fixing the assert and dead code. HashMap optimization can be follow-up work. |
value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct! Using a HashMap will be both more ergonomic and succinct. There won't be a need to iterate over the items to find but it could use a map lookup directly. Prevents suboptimal performance |
value:useful; category:bug; feedback:The Claude AI reviewer is correct! There is no need to check that the tracked position and the number of fields are the same because this is already checked in the earlier whilez condition. Prevents adding unused code. |
value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct! A software library should avoid panicking if possible. In the current case it should return an Err like for any other problem earlier or to use debug_assertion!() macro that will report the problem in dev mode only. |
351: To review by AI